Localization of Denaturation Bubbles in Random DNA Sequences 
We study the thermodynamic and dynamic behaviors of twist-induced denaturation bubbles in
a long, stretched random sequence of DNA. The small bubbles associated with weak twist are 
delocalized. Above a threshold torque, the bubbles of several tens of bases or larger become 
preferentially localized to AT-rich segments. In the localized regime, the bubbles exhibit "aging" 
and move around sub-diffusively with continuously varying dynamic exponents. These properties 
are derived using results of large-deviation theory together with scaling arguments, and are verified 
by Monte-Carlo simulations. 



I. INTRODUCTION 

Localized opening of double-stranded DNA is essential
in a number of cellular processes, such as the initiation
of gene transcription and DNA replication. While
thermal denaturation is highly unlikely under physiological
conditions, in vitro experiments show that local denaturation
can be readily induced by under-winding the
DNA double helix by an amount that is physiologically
reasonable [?]. The basic physical effect is simple:
An under-wound double helix suffers a reduction in the
binding free energy [?]. Local openings of the double
helix (referred to as "denaturation bubbles") relieve
the twist experienced by the remainder of the double helix
and are thus energetically favorable. The denaturation
bubbles may be recruited to a specific location of the
genome by a designed (e.g., AT-rich) sequence, since AT
pairs bind more weakly than GC pairs. On the other
hand, the entropic effect, which favors bubble delocalization,
is non-negligible for long sequences. Also significant is
the kinetic trapping of the bubbles due to statistical
agglomeration of AT-rich segments in long heterogeneous
sequences.

To gain some quantitative understanding of the competing
effects of entropy and sequence heterogeneity,
we characterize in this study the thermodynamic and
dynamic properties of denaturation bubbles in a long,
stretched random DNA sequence with no special sequence
design. Previously, there have been a number
of experimental and theoretical studies [?] on
the effect of sequence heterogeneity on DNA melting and
unzipping transitions. Our study is along this general
direction. The specific behaviors exhibited by the
denaturation bubbles are rather complex and are typical of
those observed in systems dominated by quenched disorder [?]:
The bubbles are localized upon increase of the
applied torque beyond a certain threshold. In the localized
regime, their dynamics exhibits "aging" [?] and
is sub-diffusive with continuously varying exponents.

Interestingly, twist-induced denaturation presents a
rare physical example of the celebrated random-energy
model of disordered systems [?]. Consequently, detailed
analysis of both the thermodynamic and dynamic properties
can be made by applying the well-developed theory
of disordered systems together with exact results
from large-deviation theory familiar in the related
sequence alignment problem [?]. We will draw upon
detailed experimental knowledge of thermal denaturation
[?] throughout the analysis, and make our
results quantitative whenever possible.



II. THERMODYNAMICS 

Let us consider the application of a torque which
under-winds a long, stretched^1 piece of double-stranded
DNA. We are interested in the regime where the applied
torque $\mathcal{T}$ is below the threshold $\mathcal{T}_d$ for bulk denaturation,
but sufficiently strong so that denaturation bubbles do
appear in the system. Due to the highly cooperative nature
of the denaturation process, the typical distance $N_\times$
between the large bubbles is large, in which case treating
the bubbles as a dilute gas of particles is appropriate.
Our strategy will be to first characterize analytically the
thermodynamic behavior of a single bubble, and then
use this knowledge to determine the length scale $N_\times$ and
the many-bubble states for $N \gg N_\times$. We will find that
$N_\times > O(10^3)$ bp as long as we are not very close to the
threshold $\mathcal{T}_d$, so that the dilute gas approximation is
reasonable for a large range of parameters.



A. The Single-bubble Model 

Consider a denaturation bubble confined in a DNA
double helix between two complementary DNA strands
of N bases each. The double strand is denoted by the
base sequence $b_1 b_2 \cdots b_N$ [with $b_k \in \{A, C, G, T\}$] of one of
the strands, ordered from the 5'- to the 3'-end. To simplify
the notation, we assume that the two ends of the helix
are sealed, so that the bubble is always contained in the
segment $b_1 \ldots b_N$. Let the indices of the first and last open
pairs of the bubble be m and n, with $1 \le m \le n \le N$.
We denote the total free energy of the bubble (defined
with respect to the helical state) by $\Delta G_L(m)$, where
$L = n - m + 1$ is the number of open bases, referred to
as the bubble length. Then the partition function of the
single-bubble system is given by

$$Z(N) = \sum_{L=1}^{N} \sum_{m=1}^{N-L+1} e^{-\beta \Delta G_L(m)}, \tag{1}$$

where $\beta^{-1} = k_BT \approx 0.62$ kcal/mole at 37°C.

^1 A modest stretching force is needed to prevent the applied torque
from being absorbed by super-coiling; see e.g., Ref. [?].

In the absence of the external torque, the bubble energy
$\Delta G_L(m)$ has two components. First, there is the
loss of stacking energy $\delta G_{b,b'}$ between two successive
bases b and b'. These stacking energies are in the range
0.5 ~ 2.5 $k_BT$ at 37°C, with the AT stacks weaker than
the GC stacks. Their values have been measured carefully [?].
Second, assuming that there is no secondary
pairing between bases in the bubble, so that the
open configuration can be regarded as a polymer loop,
there is a well-known polymeric loop entropy cost

$$\gamma_L = \gamma_1 + \alpha \cdot k_BT \ln L \tag{2}$$

for a bubble of length L, with $\alpha \approx 1.8$ for a linearly
extended^2 DNA chain. The bubble initiation cost $\gamma_1$
depends on the base composition at the opening and closing
ends, ionic strength, etc., and generally lies^3 in the range
3 ~ 5 $k_BT$. For relevant bubble sizes of a few tens of bases
in length (see below), the total entropic cost is
$\gamma_L \approx$ 8 ~ 12 $k_BT$. This large cost justifies the single-bubble
approximation (at least up to the length scale $\sim e^{\beta\gamma_L} \approx 5 \times 10^3$ bp),
and contributes significantly to the sharpness
of the observed thermal denaturation transition [?].

An applied negative torque $\mathcal{T}$ reduces the thermodynamic
stability of the helical state relative to the denatured
one by an amount equal to the work done to unwind
the helix. This effect is simply modeled here by
a linear decrease in the stacking energy in the relevant
parameter range, i.e., $\delta G_{b,b'} \to \delta G_{b,b'} - \theta_0 \mathcal{T}$, where
$\theta_0 = 2\pi/10.35$ is the twist angle per base of the double
helix. Putting the above together, we have

$$\Delta G_L(m) = \gamma_L + \sum_{k=m}^{m+L} \delta G_{b_{k-1},b_k} - \theta_0 \mathcal{T} \cdot L \tag{3}$$

as the single-bubble energy, which can be computed once
the DNA sequence $b_1 \ldots b_N$ is given.

^2 The value of $\alpha$ may well be different for an unstretched DNA chain,
and hence relevant for the thermal denaturation of DNA [23, 24].
However, as we show below, essential features of denaturation do
not hinge on the precise value of $\alpha$.

^3 The initiation costs for DNA bubbles are extracted from the
webserver: http://www.bioinfo.rpi.edu/applications/mfold/ (M.
Zuker, private communication). See also Ref. [?] for an alternative
source.

Note that while Eq. (3) is formulated specifically for twist-induced
denaturation, the general form can be used to describe a
number of destabilizing effects, e.g., due to changes in
temperature, ionic concentration, etc.
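As a concrete illustration of Eqs. (1)-(3), the sketch below evaluates the single-bubble energy and partition function for a given sequence. It is a minimal sketch under stated assumptions: energies are in units of $k_BT$, the stacking rule `toy_stack` is hypothetical (it only mimics AT stacks being weaker than GC stacks and is not the measured table), the torque is expressed in $k_BT$ per radian, and the bubble is restricted to interior positions so that both flanking bases exist. All function names are illustrative.

```python
import math

THETA0 = 2 * math.pi / 10.35   # twist angle per base (rad), from the text
ALPHA = 1.8                    # loop exponent, from the text
GAMMA1 = 3.0                   # initiation cost in k_BT (text quotes 3 ~ 5)

def toy_stack(b_prev, b):
    """Hypothetical stacking energy in k_BT; NOT the measured values."""
    n_gc = (b_prev in "GC") + (b in "GC")
    return (1.0, 1.5, 2.0)[n_gc]

def bubble_energy(seq, m, L, torque):
    """Eq. (3) in k_BT. m is the 1-based index of the first open base."""
    gamma_L = GAMMA1 + ALPHA * math.log(L)            # Eq. (2)
    # Stacks k = m .. m+L join bases b_{k-1} and b_k (1-based indices).
    stack = sum(toy_stack(seq[k - 2], seq[k - 1]) for k in range(m, m + L + 1))
    return gamma_L + stack - THETA0 * torque * L

def partition_function(seq, torque, L_max=None):
    """Eq. (1), restricted to interior bubbles (2 <= m <= N - L)."""
    N = len(seq)
    L_max = L_max or N - 2
    Z = 0.0
    for L in range(1, L_max + 1):
        for m in range(2, N - L + 1):
            Z += math.exp(-bubble_energy(seq, m, L, torque))
    return Z
```

With this toy rule, a bubble placed on an AT-rich window has a lower energy than one of the same length on a GC-rich window, and increasing the torque increases every Boltzmann weight in the sum.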



B. Sequence Heterogeneity 

As the torque $\mathcal{T}$ increases from zero towards the
denaturation point $\mathcal{T}_d$, denaturation bubbles appear in the
double strand and grow in size. We wish to know whether
the bubbles are free to diffuse along the double strand,
or whether they are localized in the AT-rich regions of the DNA
where binding is the weakest. For simplicity, we will
characterize the typical behavior of an ensemble of random
(i.e., independent and identically distributed) sequences
described by the single-nucleotide frequencies $p_b$,
although our approach and qualitative findings are also
applicable to sequences with short-range correlations.

For a given sequence of bases, the partition function
Z can of course be efficiently evaluated numerically
(including all the multiple-bubble states) by using available
programs such as MELTSIM [?]. All thermodynamic
quantities can subsequently be evaluated from the free
energy $F = -k_BT \ln Z$. To obtain the typical behavior
of the ensemble, we ideally want to compute the ensemble
average of the free energy, $\overline{F} = -k_BT\,\overline{\ln Z}$. [We use
the over-line to denote the average over the ensemble of random
sequences, i.e., $\overline{X} = \sum_{b_1,\ldots,b_N} X_{b_1,\ldots,b_N} \prod_{k=1}^{N} p_{b_k}$;
this is also known as the "disorder average".] Computing
$\overline{F}$ numerically, however, will require explicit generation
of a large number of random sequences and can be
very time consuming for large N. Fortunately, we can
apply a large body of knowledge accumulated from the
statistical mechanics of random systems [?] and provide
a detailed characterization of the typical behavior of our
system without the need of exhaustive simulation. To
introduce notation and concepts in this approach, we
examine first the simplified problem of a single bubble with
a fixed length.



C. Bubble with Fixed Length 

Let us consider a bubble with a fixed length L (with
$1 \ll L \ll N$) embedded in a long, random sequence
$b_1 \ldots b_N$. The partition function reads

$$\mathcal{Z}_L = \sum_{m=1}^{N-L+1} \exp[-\beta \Delta G_L(m)], \tag{4}$$

where the scripted variables refer to properties of the
fixed-length bubble. For a random sequence, the energies
of the different states labeled by m are uncorrelated with
each other beyond the distance L. Such systems belong^4
to the class of the "Random Energy Model" (REM), which was
solved exactly in the 1980s by Derrida [?] for a Gaussian
distribution of $\Delta G$'s. A discrete distribution of $\Delta G$'s was
studied in the closely related system involving protein-DNA
interaction [27]. Below, we will briefly review the
salient properties of the REM using the present example.

The REM has a "high-temperature" phase where many
(of order N) bubble configurations contribute significantly
to the partition sum, and a "low-temperature"
phase dominated by only one or a few lowest-energy
states. It follows that, in the former case, the bubble
is delocalized and can freely diffuse along the sequence,
while in the latter case, the bubble is localized to the
lowest-energy position. The transition between the two phases is
driven by competition between the energetic (variation in
$\Delta G$) and entropic ($\ln N$) effects. In the present problem,
the magnitude of the terms in the partition sum (4) can be
tuned not only by varying the temperature, but also by
varying the bubble size L. Hence, at a fixed $\beta$, whether
a bubble is free or localized depends both on the bubble
size L and the sequence length N.

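An interesting property of the REM is that, in the
"high-temperature" phase, $\mathcal{Z}_L/N$ tends to a finite limit
given by the annealed average $\overline{\mathcal{Z}_L}/N$ as $N \to \infty$,
independent of the particular realization of the random
sequence. This allows us to replace the average free
energy $\overline{\mathcal{F}} = -k_BT\,\overline{\ln \mathcal{Z}_L}$ by its annealed approximation
$\widetilde{\mathcal{F}} = -k_BT \ln \overline{\mathcal{Z}_L}$, which is much easier to calculate. [We
will use the tilde to denote all quantities computed in
the annealed approximation.] Introducing a 4 × 4 matrix
$\mathbf{M}(\beta)$ with components $M_{b,b'} = \sqrt{p_b p_{b'}}\, \exp[-\beta\, \delta G_{b,b'}]$,
and letting the largest eigenvalue of $\mathbf{M}(\beta)$ be $\Lambda(\beta)$, the
disorder average of the terms in (4) can be written as

$$\overline{\exp\Big[-\beta \sum_{k=m}^{m+L} \delta G_{b_{k-1},b_k}\Big]} \approx \mathrm{Tr}\, \mathbf{M}^L(\beta) \approx \Lambda^L(\beta). \tag{5}$$

It is convenient to introduce the quantity

$$f(\beta) = -\beta^{-1} \ln \Lambda(\beta), \tag{6}$$

with which we have (for $N \gg L$)

$$\widetilde{\mathcal{Z}}_L \equiv \overline{\mathcal{Z}_L} = \frac{N e^{-\beta\gamma_1}}{L^{\alpha}}\, e^{-\beta [f(\beta) - \theta_0 \mathcal{T}] L}. \tag{7}$$

Hence, in the delocalized phase, $\overline{\mathcal{F}} \approx \widetilde{\mathcal{F}} = -k_BT \ln N + L[f(\beta) - \theta_0 \mathcal{T}] + \gamma_L$.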

The annealed entropy can be calculated from $\widetilde{\mathcal{F}}$, with^5

$$\widetilde{\mathcal{S}} = -k_B^{-1} \frac{\partial \widetilde{\mathcal{F}}}{\partial T} = \ln N - \beta [f(\beta) - \epsilon(\beta)]\, L, \tag{8}$$

where $\epsilon(\beta) = -\frac{\partial}{\partial \beta} \ln \Lambda$. It will also be useful to introduce
the relative entropy per base for the fixed-length bubble,

$$\mathcal{H}(\beta) = [\ln N - \widetilde{\mathcal{S}}]/L = \beta [f(\beta) - \epsilon(\beta)]. \tag{9}$$

^4 The correlation in $\Delta G$ between neighboring states is only a minor
complication because it is short-ranged and can be transformed
away by coarse graining.

^5 To focus on the positional entropy, we did not include here the
contribution due to loop entropy, i.e., we treated $\gamma_L$ as an energy
term despite its entropic origin.

Note that, being the difference between $f$ and $\epsilon$, the quantity
$\mathcal{H}$ is a measure of the intrinsic variation in the binding
energies $\delta G$'s for a random sequence with nucleotide
frequency $p_b$, and is independent of the average binding
energy $\overline{\delta G}$, which the external environment, such as the
temperature or solvent conditions, most directly affects.
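The quantities $\Lambda(\beta)$, $f(\beta)$, $\epsilon(\beta)$ and $\mathcal{H}(\beta)$ of Eqs. (5)-(9) can be evaluated numerically for any stacking-energy table. The following is a minimal sketch, assuming equal nucleotide frequencies and a purely hypothetical stacking rule `toy_stack` in units of $k_BT$ (so that `beta = 1` plays the role of 37°C). It does not use the measured $\delta G$ values, and the function names are illustrative only. The largest eigenvalue is found by power iteration, and $\epsilon$ by a central finite difference.

```python
import math

BASES = "ACGT"

def toy_stack(b_prev, b):
    """Hypothetical stacking energy in k_BT; NOT the measured values."""
    n_gc = (b_prev in "GC") + (b in "GC")
    return (1.0, 1.5, 2.0)[n_gc]

def largest_eigenvalue(beta, p=0.25, iters=200):
    """Largest eigenvalue of M_{b,b'} = sqrt(p_b p_b') exp(-beta dG_{b,b'})."""
    # For equal frequencies, sqrt(p_b p_b') is simply p.
    M = [[p * math.exp(-beta * toy_stack(a, b)) for b in BASES] for a in BASES]
    v = [1.0] * 4
    lam = 1.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]
        lam = max(w)                 # v is normalized so that max(v) = 1
        v = [x / lam for x in w]
    return lam

def f(beta):
    """Eq. (6)."""
    return -math.log(largest_eigenvalue(beta)) / beta

def eps(beta, h=1e-5):
    """epsilon(beta) = -d ln(Lambda)/d beta, by central difference."""
    up = math.log(largest_eigenvalue(beta + h))
    down = math.log(largest_eigenvalue(beta - h))
    return -(up - down) / (2 * h)

def relative_entropy(beta):
    """Eq. (9)."""
    return beta * (f(beta) - eps(beta))
```

Adding a constant to every stacking energy shifts $f$ and $\epsilon$ by the same amount, so `relative_entropy` is unchanged, in line with the remark that $\mathcal{H}$ depends only on the variation of the $\delta G$'s.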

Derrida's solution of the REM shows that the annealed
entropy $\widetilde{\mathcal{S}}$ vanishes at the transition to the
"low-temperature" phase, beyond which the annealed approximation
is no longer applicable. Using Eqs. (8) and (9),
we can write the condition for the phase transition as

$$L_{\rm loc} = \ln N / \mathcal{H}(\beta), \tag{10}$$

which gives the minimal bubble size for localization at a
given N. With the values of $\delta G$'s obtained from Ref. [?]
at [Na$^+$] = 1 M and 37°C, and assuming an equal nucleotide
distribution (i.e., $p_b = 1/4$ for all bases), we
find $f \approx 1.83\, k_BT$, $\epsilon \approx 1.50\, k_BT$, so that $\mathcal{H} \approx 0.33$ and
$L_{\rm loc} \approx 20$ bp for $N \sim 10^3$ bp. From Eq. (10) it is clear
that as $N \to \infty$ any fixed-length bubble remains delocalized.
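As a quick check of the numbers quoted above, Eq. (10) can be evaluated directly, assuming $N = 10^3$ bp as the representative sequence length:

$$L_{\rm loc} = \frac{\ln 10^3}{0.33} \approx \frac{6.91}{0.33} \approx 21\ \text{bp},$$

consistent with the quoted value of about 20 bp. Because $L_{\rm loc}$ grows only logarithmically, a thousandfold longer sequence ($N = 10^6$ bp) merely doubles it, to about 42 bp.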



D. Bubble without Length Constraint 

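The full partition function Z of Eq. (1) is obtained simply
by summing $\mathcal{Z}_L$ of Eq. (4) over the different L's. We will again
approach the problem by first applying the annealed
approximation and then determining where it breaks down.

1. Annealed approximation: The annealed partition function
$\widetilde{Z}(N) = \sum_{L=1}^{N} \widetilde{\mathcal{Z}}_L$ has a transition at $\mathcal{T}_c = f(\beta)/\theta_0$,
where the exponential factor in (7) reaches one: The sum
over L is finite and $\widetilde{Z} \propto N$ only if $\mathcal{T} < \mathcal{T}_c$. In this regime,
the annealed free energy is simply $\widetilde{F} \approx -k_BT \ln N + \gamma_1$.
The annealed energy $\widetilde{E} = -\frac{\partial}{\partial\beta} \ln \widetilde{Z}(\beta)$ is also readily
computed; it can be expressed as $\widetilde{E} = \gamma_1 + [\epsilon(\beta) - \theta_0 \mathcal{T}] \cdot \widetilde{L}$,
where $\widetilde{L}(\mathcal{T}) = \sum_{L=1}^{\infty} L\, \widetilde{\mathcal{Z}}_L / \widetilde{Z}$ is the average bubble
length in the annealed approximation. As $\mathcal{T}$ approaches
$\mathcal{T}_c$, $\widetilde{L}(\mathcal{T})$ diverges, and the annealed entropy

$$\widetilde{S} = \ln N - \beta [\theta_0 \mathcal{T} - \epsilon(\beta)] \cdot \widetilde{L} \tag{11}$$

becomes negative.

In the limit $N \to \infty$, the annealed free energy $\widetilde{F}$ is
actually identical to $\overline{F}$ for all $\mathcal{T} < \mathcal{T}_c$. This can be seen
from the inequalities $\overline{\ln \mathcal{Z}_{L=1}} \le \overline{\ln Z} \le \ln \overline{Z}$, and
$\mathcal{Z}_{L=1} \ge N \min\{\exp[-\beta \Delta G_1]\}$. Since both the lower and upper
bounds grow as $\ln N$,

$$\overline{F} = -k_BT\, \overline{\ln Z} \simeq -k_BT \ln N \tag{12}$$

for all $\mathcal{T} < \mathcal{T}_c$.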
2. Ground-state properties: To find the ground state of
the unconstrained bubble in a long random sequence, we
need to study the statistics of stretches of exceptionally
high AT-content. If we neglect the polymeric contribution
$\gamma_L$ to the bubble energy [to be justified shortly], then
the ground-state energy $\overline{E^*}$ expected in a sequence of
length N can be computed exactly from large-deviation
theory, with

$$\overline{E^*}(N) \approx -\lambda^{-1} \ln N. \tag{13}$$

The constant $\lambda$ in Eq. (13) can be expressed as the unique
positive root of the equation

$$f(\lambda) = \theta_0 \mathcal{T}, \tag{14}$$

where $f$ is defined by the $\delta G$'s through Eqs. (5) and (6).
Note that, at $\mathcal{T} = \mathcal{T}_c = f(\beta)/\theta_0$, Eq. (14) is satisfied with $\lambda = \beta$.
In this case, (13) coincides with (12).

The length of the minimal-energy bubble is also known
from large-deviation theory [18, 27], with

$$L^*(\mathcal{T}) \approx \ln N / H^*(\mathcal{T}), \tag{15}$$

where the relative entropy $H^*$ is given exactly by

$$H^*(\mathcal{T}) = \lambda(\mathcal{T}) \cdot [\theta_0 \mathcal{T} - \epsilon(\lambda)]. \tag{16}$$

From the logarithmic dependence of the bubble length
$L^*$ on N, it is clear that the corresponding polymeric
contribution $\gamma_{L^*} \sim \ln(\ln N)$ can indeed be treated as a
constant shift of the bubble energy.
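For a specific sequence, the ground-state bubble with $\gamma_L$ neglected is simply the contiguous run of stacks whose summed energy is lowest, which can be located in a single pass. The sketch below uses the standard running-minimum (Kadane-type) scan; it is an illustration rather than the large-deviation calculation itself. The stacking rule `toy_stack` is hypothetical, energies are in $k_BT$, the torque work is subtracted once per stack (which differs from Eq. (3) by a single boundary term), and `theta_torque` stands for $\theta_0 \mathcal{T}$.

```python
def toy_stack(b_prev, b):
    """Hypothetical stacking energy in k_BT; NOT the measured values."""
    n_gc = (b_prev in "GC") + (b in "GC")
    return (1.0, 1.5, 2.0)[n_gc]

def min_energy_segment(energies):
    """Return (energy, start, length) of the contiguous run with minimal sum."""
    best_e, best_start, best_len = energies[0], 0, 1
    cur_e, cur_start = 0.0, 0
    for i, e in enumerate(energies):
        if cur_e > 0:                # a positive running sum can only hurt
            cur_e, cur_start = 0.0, i
        cur_e += e
        if cur_e < best_e:
            best_e, best_start, best_len = cur_e, cur_start, i - cur_start + 1
    return best_e, best_start, best_len

def ground_state_bubble(seq, theta_torque):
    """Lowest-energy run of stacks; stack i joins seq[i] and seq[i + 1]."""
    energies = [toy_stack(seq[k - 1], seq[k]) - theta_torque
                for k in range(1, len(seq))]
    return min_energy_segment(energies)
```

Applied to many long random sequences, the returned energy and length could be compared with the $\ln N$ scaling of Eqs. (13) and (15).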

3. Phase transitions: Based on the above discussion, a
phase transition can be formally established in the limit
$N \to \infty$. This is seen by comparing the expressions
(12) and (13). For $\mathcal{T} > \mathcal{T}_c$ [with $\mathcal{T}_c = f(\beta)/\theta_0$], the solution to (14) satisfies
$\lambda < \beta$. Consequently, (12) must break down there, yielding
a phase transition at $\mathcal{T} = \mathcal{T}_c$. Since $\overline{F} \le \overline{E^*}$ in
general (e.g., for all $\mathcal{T} > \mathcal{T}_c$), and at the phase transition
point $\mathcal{T} = \mathcal{T}_c$ the equality $\overline{F} = \overline{E^*}$ already holds,
i.e., the ground state already dominates, we must
have the ground state dominating throughout the localized
phase. This is exactly the behavior of the random
energy model [16].

A physical understanding of the transition can be obtained
by examining the importance of the ground-state
contribution $\exp(-\beta \overline{E^*}) \sim N^{\beta/\lambda}$ to the partition sum Z
as the applied twist $\mathcal{T}$ is varied. For $\mathcal{T} < \mathcal{T}_c$ [with $\mathcal{T}_c = f(\beta)/\theta_0$], the ratio
$\beta/\lambda(\mathcal{T})$ is less than one. In this case, the energy gain
$\overline{E^*}(N)$ of placing the bubble at the site of the lowest energy
is insufficient to overcome the entropy $\ln N$ of placing
the bubble in different positions, hence the bubble is
typically small and delocalized. When $\mathcal{T}$ exceeds $\mathcal{T}_c$, the
ground-state contribution grows faster than N, signalling
dominance of one or a few low-energy states where the
bubble typically resides. The transition is thus identified
as the localization transition of the bubble at $\mathcal{T}_{\rm loc} = \mathcal{T}_c$.

The onset of the zero-entropy point can be obtained
from Eq. (11) and written as

$$\ln N = H(\beta, \mathcal{T}) \cdot \widetilde{L}(\beta, \mathcal{T}), \tag{17}$$

where

$$H(\beta, \mathcal{T}) = \beta \cdot [\theta_0 \mathcal{T} - \epsilon(\beta)] \tag{18}$$

is the relative entropy of the unconstrained bubble.
These equations are analogous to the expressions (15)
and (16) for the ground-state bubble. In fact, both
$L^*(\mathcal{T})$ and $H^*(\mathcal{T})$ are reproduced through the substitution
$\beta \to \lambda(\mathcal{T})$, e.g., $L^*(\mathcal{T}) = \widetilde{L}(\lambda(\mathcal{T}), \mathcal{T})$. This turns
out to be true also for other thermodynamic variables.
Thus the localized phase at different $\mathcal{T}$'s can be viewed
as the phase transition points of systems with different
effective temperatures $\lambda^{-1}(\mathcal{T})$; this will be clearly
manifested in the bubble dynamics discussed below.

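Next we observe that since $H^* \propto \lambda$ [see Eq. (16)], the
bubble length diverges (or approaches N) as $\lambda \to 0$. This
defines the point of bulk denaturation^6 $\mathcal{T}_d$, i.e.,

$$\theta_0 \mathcal{T}_d = \lim_{\lambda \to 0} f(\lambda) = \overline{\delta G}, \tag{19}$$

where the second equality is obtained from manipulating
Eqs. (5) and (6). Using $\overline{\delta G} \approx 1.40\, k_BT$ [derived from
the $\delta G$'s in Ref. [?]], we find $\mathcal{T}_d \approx 10$ pN·nm. The
dependence of $\lambda$ on $\mathcal{T}$ close to $\mathcal{T}_d$ can be obtained from
the expansion

$$f(\lambda) = \overline{\delta G} - \frac{\lambda}{2} \mathrm{var}(\delta G) + O(\lambda^2). \tag{20}$$

Inverting the above for $\lambda$ using (14) and (19), we find

$$\frac{\lambda(\mathcal{T})}{\beta} = \frac{2\theta_0}{\beta\, \mathrm{var}(\delta G)} (\mathcal{T}_d - \mathcal{T}) + O[(\mathcal{T}_d - \mathcal{T})^2]. \tag{21}$$

It turns out that the term linear in $\mathcal{T}_d - \mathcal{T}$ in (21) already
gives a very good approximation (to within 1%)
of $\lambda$ throughout the localized phase where $\lambda/\beta < 1$.
The localization transition point $\mathcal{T}_{\rm loc}$ can thus be obtained
by solving Eq. (21) with $\lambda(\mathcal{T}_{\rm loc}) = \beta$. Using
$\beta^2 \mathrm{var}(\delta G) \approx 0.565$ [derived from Ref. [?]], we find
$\mathcal{T}_d - \mathcal{T}_{\rm loc} \approx 2$ pN·nm. Unlike the value of $\mathcal{T}_d$, which is
derived from the average stacking energy $\overline{\delta G}$ and hence
sensitive to temperature, ionic strength, etc., the difference
$\mathcal{T}_d - \mathcal{T}_{\rm loc}$ is set by the variance of $\delta G_{b,b'}$ and should
be much less sensitive to experimental conditions. The
same is expected for the relative entropy, which has the
form

$$H^*(\mathcal{T}) \approx 2\theta_0^2 (\mathcal{T}_d - \mathcal{T})^2 / \mathrm{var}(\delta G) \tag{22}$$

throughout the localized phase.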
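The two torque values quoted above follow from Eqs. (19) and (21) by a unit conversion. A minimal sketch of that arithmetic, using only the numbers given in the text ($\overline{\delta G} \approx 1.40\, k_BT$, $\beta^2 \mathrm{var}(\delta G) \approx 0.565$, 37°C) together with the Boltzmann constant; the variable names are illustrative:

```python
import math

K_B = 1.380649e-23                 # Boltzmann constant, J/K
T_KELVIN = 310.15                  # 37 C
KBT_PN_NM = K_B * T_KELVIN * 1e21  # 1 J = 1e21 pN nm
THETA0 = 2 * math.pi / 10.35       # twist angle per base, rad

MEAN_DG = 1.40                     # mean stacking energy in k_BT (from the text)
BETA2_VAR = 0.565                  # beta^2 var(dG) (from the text)

# Eq. (19): theta0 * T_d = mean(dG)
T_d = MEAN_DG * KBT_PN_NM / THETA0

# Eq. (21), linear term, with lambda(T_loc) = beta:
#   T_d - T_loc = beta * var(dG) / (2 * theta0)
gap = BETA2_VAR * KBT_PN_NM / (2 * THETA0)

T_loc = T_d - gap
```

This gives a denaturation torque just under 10 pN·nm and a gap close to 2 pN·nm, so that $\mathcal{T}_{\rm loc}$ is roughly $0.8\, \mathcal{T}_d$.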



^6 Note, however, that the helical segments separating adjacent bubbles
can be stable even beyond $\mathcal{T}_d$, so that complete separation
of the two strands takes place at $\mathcal{T} > \mathcal{T}_d$.
E. Multiple Bubbles

The localization transition discussed above occurs only
as $N \to \infty$. However, for large N, the single-bubble
approximation will break down regardless of the large (but
finite) bubble cost $\gamma_L$. When multiple bubbles are localized,
each bubble is effectively in a finite-length system,
thereby blurring the localization transition.

We first analyze the delocalized phase, for which the
annealed approximation is valid. Once multiple bubbles are
allowed in the system, we expect a broad range of bubble
lengths, as described by the distribution (7). Qualitatively,
we expect only the largest bubbles, of size $\widetilde{L}(\mathcal{T})$, to
be localized as $\mathcal{T} \to \mathcal{T}_{\rm loc}$, while the smaller ones remain
delocalized. We shall thus focus on these large bubbles:
It is the average separation distance $N_\times$ between these
large bubbles that sets the effective system size of the
single-bubble localization problem.

The Boltzmann weight of one such large bubble in a
sequence of length $N \gg \widetilde{L}$ is $W(N) = e^{-\beta\gamma_1} N / \widetilde{L}^{\alpha}$ in the
vicinity of the localization transition. Setting $W(N) = 1$
yields the typical spacing between the large bubbles on
the delocalized side,

$$N_\times \approx e^{\beta\gamma_1}\, \widetilde{L}^{\alpha}(\mathcal{T}). \tag{23}$$

Note that for bubbles of size 10 bp, the crossover length
is already of the order of $10^3$ bp. A similar estimate can
be made on the localized phase using the exact expression
for the lowest energy for multiple bubbles [?]. We
find

$$N_\times \approx \ldots \tag{24}$$

as the average distance between large bubbles of size $L^*$.

For $N \gg N_\times$, the system consists of an effective number
$N/N_\times$ of single-bubble subsystems, each of length
$N_\times$. At the localization "transition" of an infinite
system then, we have $\ln N_\times = H(\beta, \mathcal{T}_{\rm loc})\, \widetilde{L}(\mathcal{T}_{\rm loc})$ [see
Eq. (17)]. Together with Eq. (23) [or (24) with $\lambda = \beta$],
we find $\widetilde{L}(\mathcal{T}_{\rm loc}) \approx 25$ bp at the onset of localization (using
$\gamma_1 \approx 3\, k_BT$ and $H \approx 0.33$), with the crossover length
$N_\times \approx 6500$ bp. Thus we expect there to be typically one
bubble of $\approx 25$ bp in a random DNA double strand of
length $\approx 6500$ bp at the localization transition.
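The self-consistency condition used above, $\ln N_\times = H \cdot \widetilde{L}$ combined with $N_\times \approx e^{\beta\gamma_1} \widetilde{L}^{\alpha}$, reduces to a single equation $H L = \beta\gamma_1 + \alpha \ln L$ for the bubble length. A minimal sketch of solving it by bisection, assuming the parameter values quoted in the text ($\beta\gamma_1 = 3$, $\alpha = 1.8$, $H = 0.33$); the function name and the search bracket are illustrative choices:

```python
import math

def crossover(gamma1=3.0, alpha=1.8, H=0.33, lo=2.0, hi=200.0, tol=1e-9):
    """Solve H*L = gamma1 + alpha*ln(L) for L, then return (L, N_x)."""
    def g(L):
        return H * L - gamma1 - alpha * math.log(L)
    if not (g(lo) < 0.0 < g(hi)):
        raise ValueError("bracket does not enclose the large-L root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    L = 0.5 * (lo + hi)
    return L, math.exp(H * L)

L_onset, N_cross = crossover()
```

With these inputs the root lies near 27 bp, with a crossover length of several thousand base pairs, the same order as the values of about 25 bp and 6500 bp quoted in the text. The result is quite sensitive to $H$ and $\gamma_1$, so modest rounding of either input shifts $N_\times$ noticeably.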



III. BUBBLE DYNAMICS 

The localization of bubbles is reflected ultimately in
their slow dynamics. We expect bubbles to diffuse freely
along the DNA double helix in the delocalized phase, but
to become trapped in low-energy positions in the localized
phase. Details of the bubble movement in the latter case,
however, can be rather complicated, with nontrivial memory
(or "aging") effects typical of glassy states [?], as
will be described below.

A. Model

For simplicity, we will restrict ourselves to the description
of the movement of a single bubble over its lifetime,
which can be rather long in the localized phase. For reasons
discussed above, interaction with other bubbles can
be neglected when the bubble displacement is within a
distance of order $N_\times \sim 10^3$ bp. We will also neglect the
polymeric loop entropy $\gamma_L$, which provides essentially a
constant shift to the bubble energy, as shown in the
single-bubble section.

In addition to the drift and breathing motion, a bubble
may also shrink to zero size and disappear from the system.
To our knowledge, the time scale involved for the
spontaneous collapse of a bubble, particularly under an
applied twist, has not been documented so far. Zipping
the bubble requires not only pairing of the bases in the
open segment, but also rewinding of the helix against the
applied undertwist, both of which contribute to the energy
barrier to the no-bubble state. This suggests a long
lifetime for a bubble, which can be enforced by setting
a lower bound (e.g., 10 bp) on the allowed bubble length.
However, as we will see, the long-time behavior of the bubble
dynamics is determined crucially by the occurrence
of the large-bubble states, and is insensitive to the value
of the lower bound on L, as long as the $L = 0$ state is
excluded. Once accurate estimates of the bubble lifetime become
available, one may supplement the discussion below
with such a cutoff.



B. Scaling Theory 

Equation (13) gives the lowest energy of an unconstrained
bubble in a sequence of length N, while a bubble
with its position (but not size) fixed typically has
an energy of the order $\lambda^{-1}$ for $\mathcal{T} < \mathcal{T}_{\rm loc}$. For small $\lambda$,
the energy variation $\Delta E(N) \sim \lambda^{-1} \ln N$ is large, hence
the bubble dynamics is dominated by the thermal escape
from the deepest trap. The escape time is thus
$t_c(N) \sim e^{\beta \Delta E(N)} \sim N^{\beta/\lambda}$, i.e., the dynamics is
sub-diffusive deep in the localized phase (where $\beta \gg \lambda$).

To investigate the dynamical behavior in more detail,
especially close to the localization transition where
$\lambda \approx \beta$, we need to include also the random motion of
the bubble along the double strand. Towards this end,
it is useful to describe the bubble dynamics as a single
point moving in the two-dimensional space spanned
by the bubble's only two degrees of freedom, its instantaneous
length L and the position of one of its ends, say m.
The statistics of the two-dimensional energy landscape
$\Delta G_L(m)$ is well characterized by large-deviation theory [?].
It consists of a number of valleys, whose depths
(denoted by $\Delta G$'s) are given by the Poisson distribution

$$\mathcal{P}(\Delta G) \propto e^{-\lambda\, \Delta G}, \tag{25}$$

where $\lambda$ is the constant defined through (14). The typical
valley length is $\widehat{L} \sim 1/H^*$, where $H^*$ is given by (16).
The valleys are spread out along the corridor at $L \lesssim \widehat{L}$,
separated by a typical distance M which is also calculable
from large-deviation theory. For much larger L's, the
bubble energy becomes prohibitively high.

Clearly, the dynamics consists of two parts: At short
times, it is dominated by the escape of the bubble out
of an individual valley, and is analogous to the (biased)
Sinai problem [?]. At longer time scales, the bubble
"hops" from one valley to another along the corridor of
valleys. This dynamics, which is essentially that of a
particle traversing a series of exponentially distributed
energy valleys [see Eq. (25)], has been extensively investigated
previously in the context of the one-dimensional
"trap" model [?]. Here we review some key results
and refer the readers to Ref. [?] for details. The basic
quantity is the time $\tau(\Delta G) \propto e^{\beta |\Delta G|}$ to escape each valley
of depth $\Delta G$. The average time to traverse K valleys
over a length scale $N = K \cdot M$ by random walk is then
given by

$$t_c \approx K^2 \cdot \langle \tau \rangle \approx K^2 \int dx\, \tau(x)\, \mathcal{P}(x), \tag{26}$$

where $\langle \tau \rangle$ is the average of the trap time $\tau(\Delta G)$, and the
limits of integration in (26) are from the magnitude of
the typical valley depth $\lambda^{-1}$ to that of the deepest valley
(13) expected for a segment of length N.

The total time according to Eq. (26) can be written as
$t_c(N) \propto N^z$, with the dynamic exponent z given by

$$z = \begin{cases} 2 & \text{for } \lambda > \beta \ (\text{or } \mathcal{T} < \mathcal{T}_{\rm loc}) \\ 1 + \beta/\lambda & \text{for } \lambda < \beta \ (\text{or } \mathcal{T}_{\rm loc} < \mathcal{T} < \mathcal{T}_d) \end{cases} \tag{27}$$

The anomalous exponent $z > 2$ in the glass phase shows
explicitly that the dynamics is slow, i.e., sub-diffusive.
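The second branch of Eq. (27) follows because, for $\lambda < \beta$, the integral in (26) is dominated by its upper limit $\lambda^{-1} \ln N$ and scales as $N^{\beta/\lambda - 1}$, which combines with $K^2 \propto N^2$ to give $z = 1 + \beta/\lambda$. The sketch below evaluates the exponent as a function of torque, assuming only the linear term of Eq. (21), so that $\lambda/\beta = (\mathcal{T}_d - \mathcal{T})/(\mathcal{T}_d - \mathcal{T}_{\rm loc})$. The default torques of 10 and 8 pN·nm are the rounded values from the text, and the function names are illustrative.

```python
def lambda_over_beta(T, T_d=10.0, T_loc=8.0):
    """Linear approximation of Eq. (21), normalized so lambda(T_loc) = beta."""
    return (T_d - T) / (T_d - T_loc)

def dynamic_exponent(T, T_d=10.0, T_loc=8.0):
    """Dynamic exponent z of Eq. (27) for a torque T below T_d."""
    if not T < T_d:
        raise ValueError("Eq. (27) applies only below bulk denaturation")
    ratio = lambda_over_beta(T, T_d, T_loc)
    if ratio >= 1.0:                 # delocalized phase: ordinary diffusion
        return 2.0
    return 1.0 + 1.0 / ratio         # glass phase: z = 1 + beta/lambda
```

For example, at $\mathcal{T} = 0.9\, \mathcal{T}_d$ this gives $\lambda/\beta = 0.5$ and $z = 3$, i.e., an expected displacement exponent $1/z = 1/3$.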



C. Glassy Dynamics 

We next report the results of a Monte-Carlo simulation
of the bubble dynamics on predefined random nucleotide
sequences. We impose local dynamics in which the bubble
can only change its length L or shift its end position
m by a single base, as long as $L \ge 1$. To remove edge
effects and probe the asymptotic dynamics, we use a very
large sequence length ($> 10^4$ bp) so that the bubble never
reaches the boundary of the sequence within the duration
of our numerical study. All disorder-averaged quantities
reported are averaged over $10^4$ random sequences.
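The local dynamics described above can be sketched as a single-move Monte-Carlo update on the bubble state (m, L). The text does not specify the acceptance rule, so the Metropolis criterion used here is an assumption, as are the hypothetical stacking rule `toy_stack`, the four-move set, and all function names. Energies are in $k_BT$, the loop entropy $\gamma_L$ is neglected as in the model section, and `theta_torque` stands for $\theta_0 \mathcal{T}$.

```python
import math
import random

def toy_stack(b_prev, b):
    """Hypothetical stacking energy in k_BT; NOT the measured values."""
    n_gc = (b_prev in "GC") + (b in "GC")
    return (1.0, 1.5, 2.0)[n_gc]

def prefix_stack(seq):
    """P[k+1] - P[j] = summed energy of stacks j..k; stack k joins seq[k-1], seq[k]."""
    P = [0.0, 0.0]
    for k in range(1, len(seq)):
        P.append(P[-1] + toy_stack(seq[k - 1], seq[k]))
    return P

def energy(P, m, L, theta_torque):
    """Bubble with open bases m..m+L-1 (0-based) breaks stacks m..m+L."""
    return P[m + L + 1] - P[m] - theta_torque * L

def mc_step(P, m, L, theta_torque, rng):
    """Propose shifting the bubble or changing its length by one base."""
    N = len(P) - 1
    dm, dL = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
    m2, L2 = m + dm, L + dL
    if L2 < 1 or m2 < 1 or m2 + L2 > N - 1:
        return m, L                                  # reject out-of-range moves
    dE = energy(P, m2, L2, theta_torque) - energy(P, m, L, theta_torque)
    if dE <= 0 or rng.random() < math.exp(-dE):      # Metropolis (assumed)
        return m2, L2
    return m, L

def simulate(seq, theta_torque, steps, m0, L0, seed=0):
    """Return the trajectory [(m, L), ...] including the initial state."""
    rng = random.Random(seed)
    P = prefix_stack(seq)
    m, L = m0, L0
    traj = [(m, L)]
    for _ in range(steps):
        m, L = mc_step(P, m, L, theta_torque, rng)
        traj.append((m, L))
    return traj
```

Averaging $|m(t) - m(0)|$ and $L(t)$ from such trajectories over many random sequences would give the disorder-averaged displacement and bubble length.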

1. Anomalous diffusion: To characterize the slow dynamics
quantitatively, we show in Fig. 1(a) the time evolution
of the average displacement $R(t) = \overline{|m(t) - m(0)|}$
of the bubble position for a few selected values of $\mathcal{T}$
in the glass phase. Clearly, the displacement can be described
by a power law of the form $R(t) \propto t^{\nu}$, where we
expect $\nu = 1/z$. In Fig. 1(b), we plot the extracted exponents
(circles) for different values of $\mathcal{T}$ in the range
$\mathcal{T}_{\rm loc} < \mathcal{T} < \mathcal{T}_d$. The expected values $1/z$ according to
Eq. (27) [using the linear expression in (21) for $\lambda$] are
shown as the solid line for comparison. We note that the
observed exponents follow the general trend predicted,
changing continuously from $1/z = 0.5$ close to the expected
location of the glass transition ($\mathcal{T}_{\rm loc} \approx 0.8\, \mathcal{T}_d$)
towards zero as $\mathcal{T} \to \mathcal{T}_d$. For $\mathcal{T}$ close to $\mathcal{T}_d$, the dynamics
becomes exceedingly slow, making it difficult to access
the asymptotic region. For $\mathcal{T} \approx \mathcal{T}_{\rm loc}$, we also observed
some finite-size effects. The overall agreement between the
scaling theory and numerical results is within 5 ~ 10%
over the range tested.

FIG. 1 (a) Average bubble position vs time for various values
of $\mathcal{T}$ in the glass phase; the solid lines are power-law fits. (b)
The extracted exponents vs $\mathcal{T}$; the solid line is the prediction
of the scaling theory, Eq. (27).

In Fig. 2(a), we show the dependence of the average
bubble length on time for different $\mathcal{T}$'s. The data depict
the slow, logarithmic growth of the bubble length.
Logarithmic growth is one of the signatures of glassy dynamics.
Its occurrence in this particular system can be
understood quantitatively as follows: The optimal bubble
size $L^*(N)$ in a segment of length N depends logarithmically
on N; see Eq. (15). On the other hand, for a
bubble placed at an arbitrary position in a long sequence,
the effective sequence length is the distance the bubble
can explore within a time t, i.e., $N \sim t^{1/z}$ for the
sub-diffusive dynamics expected in the glassy regime. Hence,

$$L^*(t) \approx \frac{1}{z \cdot H^*} \ln t + \text{const.} \tag{28}$$

is the expected length of the optimal bubble within a time
t. Generally, we expect $L^*(t)$ to be the upper bound of
the observed bubble length $L(t)$, with $L(t) \approx L^*(t)$ for
large t deep in the glass phase. However, below the glass
transition, $L(t)$ must be finite even for $t \to \infty$.

FIG. 2 (a) Average bubble length vs time for several $\mathcal{T}$'s; the
solid lines are fits to the form $L(t) = a + b \ln t$. (b) Slope b of
the logarithmic time-dependence of $L(t)$ for various $\mathcal{T}$'s. The
solid line is the corresponding quantity for the upper bound
of the bubble length $L^*(t)$; see text.

In Fig. 2(b), we show the coefficients of the observed
logarithmic time dependence of $L(t)$ for $\mathcal{T}$'s throughout
the range $\mathcal{T}_{\rm loc} < \mathcal{T} < \mathcal{T}_d$. Also shown is the upper
bound $1/(z \cdot H^*)$ (solid line) according to (28), using
the expression (22) for $H^*$. We note that the difference
between the data (circles) and the upper bound is nearly
constant ($\approx 1$) for the range studied.
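To illustrate how Eqs. (21), (22), (27) and (28) combine into the upper bound on the slope, consider $\mathcal{T} = 0.9\, \mathcal{T}_d$ with $\mathcal{T}_{\rm loc} \approx 0.8\, \mathcal{T}_d$. This is a sketch that keeps only the linear term of (21) and uses $\beta^2 \mathrm{var}(\delta G) \approx 0.565$, in which case (22) can be rewritten as $H^* \approx \tfrac{1}{2} \beta^2 \mathrm{var}(\delta G)\, (\lambda/\beta)^2$:

$$\frac{\lambda}{\beta} \approx \frac{\mathcal{T}_d - \mathcal{T}}{\mathcal{T}_d - \mathcal{T}_{\rm loc}} = 0.5, \qquad z = 1 + \frac{\beta}{\lambda} \approx 3, \qquad H^* \approx \tfrac{1}{2} \times 0.565 \times 0.25 \approx 0.071,$$

$$\frac{1}{z \cdot H^*} \approx \frac{1}{3 \times 0.071} \approx 4.7 .$$

Under these approximations the optimal bubble would thus gain at most about 4.7 bases per e-fold increase in time at this torque.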

2. Aging: Perhaps the most characteristic feature of
glassy dynamics is that the system "ages", e.g., the temporal
fluctuation of the system depends on how long the
system has evolved from some (arbitrary) initial condition [?]:
The longer it has evolved, the slower it
fluctuates. This is easy to understand in the context of a
rough energy landscape with deep valleys and high barriers,
since the longer the system evolves, the deeper the
energy valley it finds, and hence the higher the barrier it
will have to overcome to travel farther. This feature is in
marked contrast to sub-diffusive hydrodynamic systems,
which are time-translationally invariant.

Quantitatively, we can define the aging phenomenon
via the time-dependent correlation function $C(t_w, \Delta t)$,
which measures how much the system changes in time $\Delta t$,
after first evolving for a waiting period $t_w$ from the initial
condition. Let us define a binary variable $\eta_i(t) \in \{0, 1\}$
for each base i of the nucleotide sequence: $\eta_i(t)$ takes on
the value 1 if base i is open and belongs to the bubble at
time t, and the value 0 if base i is paired. The correlation
function, defined as $C(t_w, \Delta t) = \overline{\sum_i \eta_i(t_w)\, \eta_i(t_w + \Delta t)}$
after averaging over 10000 random sequences, is a measure
of the average self-overlap of the bubble at times $t_w$
and $t_w + \Delta t$. A more convenient quantity to characterize
is the fraction of overlap, $C(t_w, \Delta t)/L(t_w)$, where
$L(t) = \sum_i \eta_i(t)$ is the instantaneous bubble length.
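Because the bubble is a single contiguous run of open bases, the sum over $\eta_i(t_w)\, \eta_i(t_w + \Delta t)$ for one sequence is just the length of the intersection of two intervals. A minimal sketch of the per-sequence overlap fraction, assuming a bubble is represented as a pair (m, L) with open bases m, ..., m + L - 1; this representation and the function names are illustrative:

```python
def overlap(bubble_a, bubble_b):
    """Number of bases open in both bubbles, i.e. sum_i eta_i(a) * eta_i(b)."""
    m1, L1 = bubble_a
    m2, L2 = bubble_b
    return max(0, min(m1 + L1, m2 + L2) - max(m1, m2))

def overlap_fraction(bubble_tw, bubble_later):
    """Overlap normalized by the bubble length at the waiting time t_w."""
    return overlap(bubble_tw, bubble_later) / bubble_tw[1]
```

For instance, bubbles (10, 5) and (12, 5) share bases 12, 13 and 14, giving an overlap fraction of 0.6. The quantity discussed in the text is obtained by averaging over many random sequences.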

In Fig. 3(a), we show the overlap fraction, parameterized
by the different waiting times $t_w$, for the system
biased deep in the glass phase with $\Gamma = 0.9\,\Gamma_d$. The
overlap fraction clearly depends on the waiting time,
illustrating the glassy nature of the dynamics. In contrast,
the same quantity computed for $\Gamma < \Gamma_{loc}$ (data not
shown) gives no statistically significant dependence on
$t_w$. To characterize the behavior more quantitatively, we
re-plot in Fig. 3(b) the curves in (a) with $\Delta t$ normalized
by $t_w$. We find that these curves collapse reasonably well
onto a single master curve, which exhibits a weak kink at
$\Delta t/t_w \sim 1$. A naive explanation of this behavior is that
for $\Delta t \ll t_w$, the bubble stays approximately within the
energy valley found at time $t_w$, while for $\Delta t \gg t_w$, the
bubble makes excursions far away from the valley. For
the one-dimensional "trap" model, it was shown rigorously
[s^ that $C(t_w, \Delta t)$ indeed scales as a function of
$\Delta t/t_w$, even though the largest trap time actually scales
sub-linearly with $t_w$. This behavior can be understood
in terms of the particle making multiple returns to the
original valley after escaping it [32], as manifested by the
slow decay shown in Fig. 3(b) for $\Delta t \gg t_w$.

FIG. 3 Aging plot: (a) Average overlap fraction
$C(t_w, \Delta t)/L(t_w)$ for different $t_w$'s (from 1× to 512× 25,600
Monte-Carlo steps) deep in the glass phase, with
$\Gamma = 0.9\,\Gamma_d > \Gamma_{loc}$; (b) Scaling plot of (a) with
$\Delta t$ normalized by $t_w$.
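
The rescaling behind such a scaling plot can be sketched as follows. The input here is synthetic: the function `master` is a hypothetical master curve chosen only to generate test data, not a fit to the simulation results, and `rescale` and `collapse_spread` are illustrative names.

```python
def rescale(curve, t_w):
    """Map (dt, value) pairs measured at waiting time t_w to (dt / t_w, value)."""
    return [(dt / t_w, value) for dt, value in curve]

def collapse_spread(curves):
    """Largest vertical gap between rescaled curves at shared abscissas.

    curves: dict mapping t_w -> list of (dt, value) pairs.
    """
    by_x = {}
    for t_w, curve in curves.items():
        for x, value in rescale(curve, t_w):
            by_x.setdefault(x, []).append(value)
    shared = [values for values in by_x.values() if len(values) > 1]
    return max(max(values) - min(values) for values in shared)

def master(x):
    # Hypothetical master curve, used only to generate synthetic input.
    return 1.0 / (1.0 + x)

waiting_times = [1, 8, 64, 512]
curves = {
    t_w: [(t_w * 2.0 ** k, master(2.0 ** k)) for k in range(-4, 5)]
    for t_w in waiting_times
}

print(collapse_spread(curves))  # 0.0 for this synthetic input
```

For measured curves the spread would not vanish; a small value across the sampled range of $\Delta t/t_w$ is what a reasonable collapse onto a master curve would look like in this measure.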



IV. DISCUSSION 

In this study we investigated the thermodynamic and
dynamic behaviors of twist-induced denaturation bubbles
in a long, random sequence of DNA. The small bubbles
associated with weak twist are delocalized, i.e., they
flicker in and out of existence according to the Boltzmann
distribution and are independent of the DNA sequence.
The bubbles grow in length as the applied torque
increases. When the largest bubbles reach a critical
size $L_{loc}$, which is of the order of a few tens of bases, the
bubbles become localized to AT-rich segments which occur
statistically in a long random sequence. According
to the parameters [l^ taken at 37°C with [Na+] = 1 M,
the localization "transition" occurs at $\Gamma_{loc} \approx 8$ pN·nm,
which is $\approx 80\%$ of the torque $\Gamma_d$ needed for bulk
denaturation. In the localized regime, the bubbles exhibit
"aging" and move along the double helix sub-diffusively,
with continuously varying dynamic exponents.
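
As a rough consistency check on the two numbers quoted above (simple arithmetic, not a calculation taken from this study), the localization torque $\Gamma_{loc}$ and its stated ratio to the bulk denaturation torque $\Gamma_d$ fix $\Gamma_d$, and the thermal energy at 37°C gives a scale for comparison. The value of $k_B$ and the rounding are assumptions of this sketch:

```latex
\begin{align*}
\Gamma_d &\approx \frac{\Gamma_{loc}}{0.8}
          \approx \frac{8~\mathrm{pN\,nm}}{0.8} = 10~\mathrm{pN\,nm}, \\
k_B T &\approx (1.38\times 10^{-23}~\mathrm{J/K})\,(310~\mathrm{K})
       \approx 4.3\times 10^{-21}~\mathrm{J} = 4.3~\mathrm{pN\,nm}, \\
\frac{\Gamma_{loc}}{k_B T} &\approx 1.9, \qquad
\frac{\Gamma_d}{k_B T} \approx 2.3 .
\end{align*}
```

Both torques are thus of the order of a few $k_B T$ per radian of twist.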

All of the results are obtained under the single-bubble 
approximation. Thermodynamically, we expect this ap- 
proximation to be valid for DNA sequences of several 
thousand bases or less. This is due to the strongly co- 
operative nature of bubble formation, as manifested in 
the large initiation energy $\gamma_1$. The single-bubble
description of dynamics is further restricted by the finite
life time of the bubble: Even at length scales where
the single-bubble approximation is appropriate thermo- 
dynamically, the bubble may annihilate and reappear 
elsewhere in the sequence, effectively performing long- 
distance hops. Experimental knowledge of the bubble 
life time in the presence of an applied twist is needed to 
estimate the crossover time to the long-distance hopping 
regime. Qualitatively, we expect these bubbles to have 
much longer life times than the thermally denatured bub- 
bles, since the applied twist plays the role of an energy 
barrier preventing bubble annihilation. 

Finally, we note that bubble localization characterized 
in this study is a reflection of the statistical background 
present in long random nucleotide sequences. This back- 
ground traps the bubble kinetically if the bubble size be- 
comes sufficiently large. Thus, to localize denaturation 
bubbles at appropriate locations specified by designed
sequences (e.g., promoters or replication origins) for bi- 
ological functions, it is necessary to operate away from 
the localized regime, i.e., below the onset of localization. 